Integrating Talend Big Data Batch Jobs with MongoDB

 


Category

MongoDB, Spark

Prerequisites

Talend Big Data Basics, Talend Big Data - Spark Batch, Knowledge of Apache Spark, MongoDB collections, MongoDB query language and aggregation, Docker containers

Third-party software

Apache Spark, MongoDB, Docker

Description

Talend provides components and approaches that simplify creating MongoDB collections and building queries to extract information from them. Using earthquake data files publicly available on the website of INGV, Italy's National Institute of Geophysics and Volcanology, this solution template builds a Talend Spark Big Data Batch Job that demonstrates a real use case: the downloaded data is pushed to a single collection, then prepared and reused for analysis.
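
The flow described above (load records into one collection, then query them for analysis) can be illustrated outside Talend with a minimal Python sketch. This is not the Job the template builds. The pipe-delimited layout, the field names, the sample values, and the helper names `to_documents` and `strongest_by_location` are all assumptions made for illustration, and the actual INGV files may be structured differently. Only the aggregation stage operators (`$match`, `$group`, `$sort`) are standard MongoDB.

```python
import csv
import io

# Hypothetical sample rows; layout and values are invented for illustration
# and are not taken from the real INGV files.
SAMPLE = """EventID|Time|Depth|Magnitude|Location
1001|2020-01-01T00:10:00|8.1|4.2|Place A
1002|2020-01-01T02:30:00|8.0|3.4|Place B
1003|2020-01-02T03:15:00|9.0|2.1|Place A
"""


def to_documents(text):
    """Turn delimited rows into dicts shaped like MongoDB documents."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return [
        {
            "_id": row["EventID"],
            "time": row["Time"],
            "location": row["Location"],
            "depth_km": float(row["Depth"]),
            "magnitude": float(row["Magnitude"]),
        }
        for row in reader
    ]


def strongest_by_location(min_magnitude):
    """Build an aggregation pipeline: filter, group by location, sort."""
    return [
        {"$match": {"magnitude": {"$gte": min_magnitude}}},
        {
            "$group": {
                "_id": "$location",
                "events": {"$sum": 1},
                "max_magnitude": {"$max": "$magnitude"},
            }
        },
        {"$sort": {"max_magnitude": -1}},
    ]


documents = to_documents(SAMPLE)
pipeline = strongest_by_location(3.0)
```

With a running MongoDB instance and the PyMongo driver, the documents could be loaded with `collection.insert_many(documents)` and analyzed with `collection.aggregate(pipeline)`. In the Talend Job, the equivalent steps are configured through components rather than written by hand.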